The following natural language processing (NLP) toolkits are popular collections of NLP software. They are suites of libraries, frameworks, and applications for symbolic and statistical natural language and speech processing. NLP tools typically perform sentence detection, tokenization, part-of-speech (POS) tagging, text chunking, lemmatisation, coreference analysis and resolution, and named-entity detection, among other tasks.
Name | Language | License | Creators | Website |
---|---|---|---|---|
AlchemyAPI | C, C++, C#, Java, Python, Perl, Ruby | Free or Commercial | Orchestr8 | [1] |
Antelope framework | C#, VB.net | Free for research | Proxem | [2] |
Apertium | C++, Java | GPL | (various) | [3] |
Cogito | | Commercial | Expert System S.p.A. | [4] |
Carabao Language Kit | Any COM+ compliant language. Customization is via data entry | Commercial with free development tools | Digital Sonata Pty Ltd | [5] |
DELPH-IN | LISP, C++ | LGPL, MIT, ... | Deep Linguistic Processing with HPSG Initiative | [6] |
Distinguo | C++ | Commercial | Ultralingua Inc. | [7] |
Ellogon | C / C++ | LGPL | Georgios Petasis | [8] |
FreeLing | C++ | GPL | Universitat Politècnica de Catalunya | [9] |
General Architecture for Text Engineering | Java | LGPL | GATE open source community | [10] |
Graph Expression | Java | Apache License | Startup huti.ru | [11] |
Learning Based Java | Java | BSD | Cognitive Computation Group at the University of Illinois | [12] |
LingPipe | Java | Royalty-free or commercial | Alias-i | [13] |
LinguaStream | Java | Free for research | University of Caen, France | [14] |
Mallet | Java | Common Public License | University of Massachusetts Amherst | [15] |
MII NLP toolkit | Java | LGPL | UCLA Medical Imaging Informatics (MII) Group | [16] |
Modular Audio Recognition Framework | Java | BSD | The MARF Research and Development Group, Concordia University | [17] |
MontyLingua | Python, Java | Free for research | MIT | [18] |
Natural Language Toolkit (NLTK) | Python | Apache 2.0 | | [19] |
NooJ (based on INTEX) | .NET Framework-based | Free for research | University of Franche-Comté, France | [20] |
OpenNLP | Java | Apache License 2.0 | Online community | [21] |
Rosette | C, C++, Java, .NET | Commercial | Basis Technology | [22] |
ScalaNLP | Scala | Apache License | David Hall and Daniel Ramage | [23] |
Stanford NLP | Java | GPL | The Stanford Natural Language Processing Group | [24] |
RASP | C++ | LGPL | University of Cambridge, University of Sussex | [25] |
Natural | JavaScript, Node.js | GPL | Chris Umbel | [26] |
Text Engineering Software Laboratory (Tesla) | Java | Eclipse Public License | University of Cologne | [27] |
Thinktelligence Delegator | Java | Commercial | Thinktelligence Corporation | [28] |
UIMA | Java / C++ | Apache 2.0 | Apache | [29] |
WebLab-project | Java | LGPL | OW2 | [30] |
Unitex | Java & C++ | LGPL | Laboratoire d'Automatique Documentaire et Linguistique | [31] |
The Dragon Toolkit | Java | GPL | Drexel University | [32] |
Factorie | Scala | Apache License | University of Massachusetts Amherst | [33] |
Silpa Indic Language Processing Toolkit | Python | AGPL | Silpa opensource community developers | [34] |
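
As a rough illustration of the first two tasks named in the introduction, sentence detection and tokenization, the following Python sketch uses only the standard library. It is a deliberately naive, rule-based approximation and does not reflect how any toolkit in the table implements these steps; the function names `split_sentences` and `tokenize` and both regular expressions are assumptions made for this example. Real toolkits typically handle abbreviations, quotations, and language-specific rules that this sketch ignores.

```python
import re

# Assumed heuristic: a sentence ends at '.', '!' or '?' when it is followed by
# whitespace and an uppercase letter. Abbreviations such as "Dr. Smith" will
# be split incorrectly by this rule.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")

# Assumed heuristic: a token is either a word (allowing internal apostrophes
# or hyphens, as in "don't" or "named-entity") or a single punctuation mark.
_TOKEN = re.compile(r"\w+(?:['-]\w+)*|[^\w\s]")


def split_sentences(text):
    """Split text into sentences using the naive boundary rule above."""
    return [s for s in _SENTENCE_BOUNDARY.split(text.strip()) if s]


def tokenize(sentence):
    """Split one sentence into word and punctuation tokens."""
    return _TOKEN.findall(sentence)


if __name__ == "__main__":
    sample = "NLP toolkits split text into sentences. They also tokenize each one!"
    for sentence in split_sentences(sample):
        print(tokenize(sentence))
```

Later pipeline stages such as POS tagging, chunking, and named-entity detection usually consume the token lists produced by a step like this, which is why most of the suites above bundle a sentence detector and tokenizer alongside their higher-level components.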